Abstract

The sensitivity of future neutrino oscillation experiments is determined within a
frequentist framework by using a statistical procedure based on Monte Carlo simulations.
I consider the search for a non-zero value of the mixing angle $\theta_{13}$ at the T2K
and Double-Chooz experiments, as well as the discovery of CP violation, using the T2HK
experiment as an example. The probability that a discovery will be made at a given
confidence level is calculated as a function of the true parameter values by generating
large ensembles of artificial experiments. The interpretation of the commonly used
sensitivity limits is clarified.

1 Introduction



The investigation and comparison of the physics potential of future neutrino oscillation facil- 
ities has by now become an industry. Extensive studies of sensitivities of future experiments 
to neutrino parameters are performed, see for example [1]. A widely used procedure for such 
calculations — in the following referred to as standard procedure — is to assume some input 
values ("true values") for the oscillation parameters for which the predictions for the ob- 
servables in a given experiment are calculated without statistical fluctuations. Then these 
predictions are used as "data" and a statistical analysis of these data is performed to see 
how well the input values for the parameters can be reconstructed by the experiment. This 
procedure should give the sensitivity of an "average" experiment, where "average" lacks a 
precise definition. 

In this letter I will clarify the correct interpretation of such sensitivities. Focusing on
the sensitivity to a non-zero value of the lepton mixing angle $\theta_{13}$, the potential of a
future experiment will be quantified by answering the following question:

Given a true value of $\theta_{13}$, what is the probability that the hypothesis $\theta_{13} = 0$
can be excluded at a certain confidence level?

This generalises the usual sensitivity limits to a well-defined statistical statement and also
allows a precise definition of the term "average experiment". For example, one may
define the sensitivity of an average experiment as the value of $\theta_{13}$ for which
$\theta_{13} = 0$ can be excluded with a probability of 50%.

As realistic examples I will consider the T2K [2] and Double-Chooz [3] (D-Chooz) experiments.
Details of the simulation and assumptions on the experimental configurations are
given in Sec. 2. From the statistical point of view there is an important difference between
the two settings. In T2K one looks for the appearance of $\nu_e$ events from the
$\nu_\mu \to \nu_e$ oscillation channel on top of the intrinsic $\nu_e$ background in the beam.
Therefore it is similar to the problem of a Poisson signal over a background. In contrast,
D-Chooz is a disappearance experiment at a nuclear reactor comparing the $\bar\nu_e$ signal in a
near and a far detector, where in both detectors a very large number of events is obtained, and
one looks for a small difference in these large numbers beyond the geometrical scaling with the
distance. Hence one may expect that in this case Gaussian approximations are well justified.

In Sec. 3 the sensitivity of T2K and D-Chooz to $\sin^2 2\theta_{13}$ is considered by performing a
Monte Carlo (MC) simulation of the experiments. A large number of artificial data sets is
generated to calculate the actual distribution of the statistic used to decide whether
$\theta_{13} = 0$ should be rejected at a given confidence level (CL). This allows one to answer the
question stated above within a well-defined frequentist framework. Moreover, one does not rely on
the questionable assumptions necessary in the standard procedure, for example issues related to
the non-linear character of the parameters, the periodicity of the CP phase $\delta_{\rm CP}$, the
physical boundary $\sin^2 2\theta_{13} \ge 0$, the assumption of standard $\chi^2$-distributions, and
the question of how many degrees of freedom to use for them.

In Sec. 4 I extend these methods to the case of CP violation searches, where the example
of the T2HK experiment is considered. This case is especially interesting from the statistical
point of view, since the quantity of interest, "CP violation", has rather non-standard statistical
properties: it is directly related to the periodic variable $\delta_{\rm CP}$, and the assumption of a
linear parameter dependence of the observables, which is implied by the use of standard
$\chi^2$-distributions, is not justified a priori.

2 Description of the experiment simulations 

For the simulation of the D-Chooz reactor experiment the information available in the pro- 
posal Ref. [3] is used. For the far detector at distances of 998 m and 1115 m from the two 
reactor cores an exposure of 5 years with 60.5% efficiency is assumed. The near detector at 
an equal distance of 280 m from both cores will come online somewhat later and I take 3 years 
of data with 43.7% efficiency. This gives in total 75 000 events in the far and 473 400 events 
in the near detector [3]. I take into account an uncertainty on the reactor neutrino flux of 2% 
(uncorrelated between the two cores) and assume an error in the relative normalisation of the 
two detectors of 0.6%. Furthermore I include a background of 3.6% (2.7%) in the far (near) 
detector which is known within an uncertainty of 20%. A fit is performed for the two
oscillation parameters $\sin^2 2\theta_{13}$ and $\Delta m^2_{31}$, where always the true value
$\Delta m^2_{31} = 2.5 \times 10^{-3}$ eV$^2$ is adopted and external information on
$\Delta m^2_{31}$ with an accuracy of 5% at $1\sigma$ is assumed. More
details of the reactor simulation can be found in Ref. [4] and the standard limits are in good 
agreement with Ref. [3]. 

For the generation of artificial data for the D-Chooz experiment I assume that the sys- 
tematical uncertainties are random variables, i.e., for the flux normalisations from the two 
reactor cores and for the normalisations of the two detectors Gaussian variables are thrown 
with the errors given above. Then the expected event number in each bin of each detector 
is shifted accordingly, and this shifted value is used as mean for generating the "observed" 
event number in this bin according to a Poisson distribution. 
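
The generation step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used for the analysis: the function name `throw_reactor_data`, the dictionary layout, the flat 10-bin toy spectra and the equal split of events between the two cores are all hypothetical, the detector normalisations are thrown independently, and the background is omitted for brevity. Only the 2% and 0.6% errors and the total event numbers are taken from the text.

```python
import numpy as np

def throw_reactor_data(pred, rng, sigma_flux=0.02, sigma_det=0.006):
    """Return one artificial data set {detector: observed counts per bin}.

    pred[detector][core] holds the expected events per bin without
    fluctuations. Names and layout are illustrative assumptions.
    """
    cores = ("core1", "core2")
    # One flux normalisation per reactor core, shared by both detectors.
    flux = {c: 1.0 + sigma_flux * rng.standard_normal() for c in cores}
    data = {}
    for det, per_core in pred.items():
        # One normalisation per detector (assumed independent here).
        det_norm = 1.0 + sigma_det * rng.standard_normal()
        mean = det_norm * sum(flux[c] * np.asarray(per_core[c], dtype=float)
                              for c in cores)
        # Poisson fluctuation around the systematically shifted mean.
        data[det] = rng.poisson(mean)
    return data

rng = np.random.default_rng(seed=42)
n_bins = 10
cores = ("core1", "core2")
# Flat toy spectra whose totals match the quoted 75 000 (far) and 473 400 (near) events.
pred = {
    "far": {c: np.full(n_bins, 75_000 / (2 * n_bins)) for c in cores},
    "near": {c: np.full(n_bins, 473_400 / (2 * n_bins)) for c in cores},
}
data = throw_reactor_data(pred, rng)
print({det: int(counts.sum()) for det, counts in data.items()})
```

Calling `throw_reactor_data` repeatedly with the same `pred` yields the ensemble of artificial experiments; the flux shifts are drawn once per call so that they are correlated between the near and far detector.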

For the simulation of the T2K and T2HK experiments I follow closely the setup provided 
within the GLoBES software package [5] which is based on information from Ref. [2]. How- 
ever, in order to be able to perform the MC simulation for the analysis presented here a 
dedicated code has been developed which drastically reduces the required calculation time. 
To this end the following simplifications have been adopted. The oscillation parameters
$\sin^2\theta_{12}$, $\Delta m^2_{21}$, and $\Delta m^2_{31}$ are fixed to their assumed true values
0.3, $7.9 \times 10^{-5}$ eV$^2$, and $2.5 \times 10^{-3}$ eV$^2$, respectively. I analyse only the
appearance channel; the disappearance channel is taken into account implicitly by fixing
$\Delta m^2_{31}$ and assuming an external uncertainty on $\sin^2\theta_{23}$ of 0.08 (0.04) at
$1\sigma$ for T2K (T2HK). The true value of $\sin^2\theta_{23}$ is assumed to be 0.5, which implies
that the octant degeneracy is absent. I do not take into account the sgn($\Delta m^2_{31}$)
degeneracy and always assume the normal neutrino mass hierarchy.

For the T2K simulation I assume 5 years of data taking in the neutrino channel, a 
0.76 MW beam, and the 22.5 kt fiducial mass of the SK detector. Signal and background 
events are normalised to the standard GLoBES setup [5] and systematical errors of 5% are
assumed on signal and background. Despite the simplifications mentioned above standard 
sensitivity limits of this T2K simulation are in excellent agreement with the ones calculated 
with GLoBES. The T2HK implementation follows the setup adopted in Ref. [6], which 
consists of a 4 MW beam, the HK detector with 440 kt mass, 2 years of neutrino and 8 
years of anti-neutrino data. Uncorrelated systematical errors of 5% are included for the 
signal and background in neutrino and anti-neutrino running. In the case of T2HK minor 
differences of standard sensitivity limits appear between the simulation used here and the 
GLoBES implementation due to the adopted simplifications and other subtle differences in 
the analysis. 

Artificial data sets for T2K and T2HK are generated in the following way. Assuming true
values of $\theta_{13}$ and $\delta_{\rm CP}$, the predicted spectrum in reconstructed neutrino energy is calculated
using a Gaussian energy resolution of 85 MeV due to Fermi motion for quasi-elastic (QE) 
events and no energy information for non-QE events (i.e., energy smearing with uniform 
distribution). The QE and non-QE event spectra are added taking into account the relevant 
ratio of cross sections. Systematical uncertainties are considered as random variables, i.e., 
for each systematic a Gaussian variable is thrown with an error of 5% and the signal and 
background spectra are rescaled accordingly. Then the "observed" number of events in each 
bin is generated from a Poisson distribution with the mean corresponding to the systematic- 
shifted predicted event numbers. 



3 Sensitivity to $\theta_{13}$

In this section I am going to answer the question quoted in the introduction considering the 
T2K and D-Chooz experiments. I start by describing in some detail the procedure used for 
this purpose. 

First one has to define a criterion to decide whether (real or artificial) data are consistent
with the hypothesis $\theta_{13} = 0$. I will use a test based on the likelihood ratio

$$ -2 \ln \frac{\mathcal{L}(\theta_{13} = 0)}{\mathcal{L}_{\max}} = \chi^2(\theta_{13} = 0) - \chi^2_{\min} \equiv \Delta\chi^2_0 , \qquad (1) $$

where the relation $\chi^2 = -2 \ln \mathcal{L}$ between the $\chi^2$ and the likelihood function of the data
has been used. $\chi^2_{\min} = -2 \ln \mathcal{L}_{\max}$ are the corresponding values at the best fit point, which
are compared to the values at $\theta_{13} = 0$. If the statistic (1) is larger than a value $\lambda(\alpha)$ the
hypothesis $\theta_{13} = 0$ can be excluded at the $100(1 - \alpha)\%$ CL. The value $\lambda(\alpha)$ is calculated
by MC in the following way. Assuming $\theta_{13} = 0$, many artificial data sets are generated. In
other words, an ensemble of many repeated experiments is considered assuming that the true
value of $\theta_{13}$ is zero. For each artificial data set the $\chi^2$ is minimised to obtain the distribution
$f_0(\Delta\chi^2_0)$ of the statistic (1). Given this distribution, $\lambda(\alpha)$ is defined by the requirement that
a fraction $\alpha$ of all experiments will have $\Delta\chi^2_0 > \lambda(\alpha)$:

$$ \alpha = \int_{\lambda(\alpha)}^{\infty} dx \, f_0(x) . \qquad (2) $$

Figure 1: Illustration of how the probability that T2K and D-Chooz will discover the value
$\sin^2 2\theta_{13} = 0.02$ at 99.73% CL is obtained. The solid (dashed) curves correspond to the CDF of the
distribution $f_0$ ($f_{\theta_{13}}$) of $\Delta\chi^2_0$ generated for a value $\theta_{13} = 0$
($\sin^2 2\theta_{13} = 0.02$ and $\delta_{\rm CP} = 108^\circ$). For comparison also the CDF
of a $\chi^2$-distribution for 1 degree of freedom is shown. The vertical and horizontal lines indicate how the
probability $P_{\rm disc}$ is obtained, see text for details.

The cumulative distribution function (CDF) of $f_0$ is shown in Fig. 1 as solid curves for T2K
and D-Chooz. In the Gaussian approximation $\Delta\chi^2_0$ should be distributed according to a
$\chi^2$-distribution for 1 degree of freedom. As visible in Fig. 1, for the examples under consideration
there are some deviations from this situation, where the difference is larger for T2K. The
cut value $\lambda(\alpha)$ for the 99.73% CL (which is equal to 9 for the $\chi^2$-distribution) is 8.23 for
D-Chooz and 7.55 for T2K.
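
To make the construction of $\lambda(\alpha)$ in Eq. (2) concrete, the following Python sketch applies it to a deliberately simple toy: a single-bin Poisson counting experiment with a known background and a non-negative signal. The toy, the background value of 20 events, and the function names are assumptions made for illustration only; they are not the T2K or D-Chooz simulations, and the resulting cut value is not meant to reproduce the numbers quoted above.

```python
import numpy as np

def delta_chi2_zero_signal(n_obs, bkg):
    """Delta chi^2 between s = 0 and the best fit with s >= 0 (Poisson counting toy)."""
    n = np.asarray(n_obs, dtype=float)

    def chi2(mu):
        # Poisson chi^2 = 2 [mu - n + n ln(n/mu)], with the log term set to 0 for n = 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            log_term = np.where(n > 0, n * np.log(n / mu), 0.0)
        return 2.0 * (mu - n + log_term)

    mu_best = np.maximum(n, bkg)  # best-fit mean, respecting the boundary s >= 0
    return chi2(bkg) - chi2(mu_best)

def critical_value(stat_null, alpha):
    """Empirical lambda(alpha): about a fraction alpha of null experiments lies above it."""
    return float(np.quantile(stat_null, 1.0 - alpha))

rng = np.random.default_rng(seed=1)
bkg = 20.0                                     # hypothetical expected background
null_counts = rng.poisson(bkg, size=500_000)   # ensemble generated with zero signal
lam = critical_value(delta_chi2_zero_signal(null_counts, bkg), alpha=0.0027)
print(f"lambda(alpha) from the toy ensemble: {lam:.2f}")
```

Because of the physical boundary and the discreteness of the counts, the cut value found in such a toy generally differs from the canonical value 9, which is the kind of deviation the MC procedure is designed to capture.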

If the experiment had been performed already and real data were available, one would now
check whether $\chi^2(\theta_{13} = 0) - \chi^2_{\min}$ evaluated for the data is larger than $\lambda(\alpha)$ to decide
whether the hypothesis $\theta_{13} = 0$ can be excluded.$^1$ In the absence of real data one can, however,
calculate the probability for this to occur as a function of the value of $\theta_{13}$. More precisely,
assuming a fixed value of $\theta_{13}$ (and in the case of T2K also of $\delta_{\rm CP}$), many artificial data sets
are generated. This yields the distribution $f_{\theta_{13}}(\Delta\chi^2_0)$ under the assumption of the
"true value" $\theta_{13}$. The probability $P_{\rm disc}(\alpha, \theta_{13})$ that $\theta_{13} = 0$ can be excluded at the
$100(1 - \alpha)\%$ CL is given by

$$ P_{\rm disc}(\alpha, \theta_{13}) = P\left[ \Delta\chi^2_0 > \lambda(\alpha) \,|\, \theta_{13} \right] = \int_{\lambda(\alpha)}^{\infty} dx \, f_{\theta_{13}}(x) . \qquad (3) $$

$^1$ Note that the test for the hypothesis $\theta_{13} = 0$ used here is equivalent to asking whether the point
$\theta_{13} = 0$ is contained in the $100(1 - \alpha)\%$ CL region constructed according to the Feldman-Cousins prescription [7].



[Plot: Probability to establish a non-zero value of $\theta_{13}$ at 99.73% CL]

Figure 2: The probability to exclude the hypothesis $\theta_{13} = 0$ at the 99.73% CL for T2K and D-Chooz as
a function of the true value of $\sin^2 2\theta_{13}$. The two curves for T2K correspond to the true values
$\delta_{\rm CP} = 108^\circ$ and $288^\circ$. The vertical lines show the corresponding "standard" sensitivities.
The dashed curves correspond to the probability $P_{\rm disc}$ calculated in the Gaussian approximation according to Eq. (4).



The calculation of $P_{\rm disc}$ is illustrated in Fig. 1 assuming that the true value of $\sin^2 2\theta_{13}$ is
0.02. One can read off from this figure that the probability to exclude the hypothesis $\theta_{13} = 0$
at 99.73% CL is about 29% for T2K (if $\delta_{\rm CP} = 108^\circ$) and 9.7% for D-Chooz.
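
The two-step logic of Eqs. (2) and (3), first fixing the cut from a null ensemble and then counting how often an ensemble with a non-zero true value exceeds it, can be sketched as follows. The example assumes a toy Gaussian measurement of a quantity $x \ge 0$ with a hypothetical resolution `sigma`; the function names and all numbers are illustrative and do not correspond to the experiment simulations of Sec. 2.

```python
import numpy as np

def delta_chi2_zero(x_obs, sigma):
    """Delta chi^2 of x = 0 relative to the best fit for a Gaussian measurement with x >= 0."""
    # For x_obs < 0 the best fit sits at the boundary x = 0, so the statistic vanishes.
    return (np.maximum(x_obs, 0.0) / sigma) ** 2

def discovery_probability(x_true, sigma, lam, rng, n_sets=200_000):
    """Fraction of artificial experiments with Delta chi^2 above the cut lam, cf. Eq. (3)."""
    x_obs = rng.normal(x_true, sigma, size=n_sets)
    return float(np.mean(delta_chi2_zero(x_obs, sigma) > lam))

rng = np.random.default_rng(seed=7)
sigma, alpha = 0.005, 0.0027   # hypothetical resolution; alpha for the 99.73% CL
# Cut value from an ensemble generated under the null hypothesis x_true = 0, cf. Eq. (2).
null_stat = delta_chi2_zero(rng.normal(0.0, sigma, size=1_000_000), sigma)
lam = float(np.quantile(null_stat, 1.0 - alpha))
for x_true in (0.0, 0.01, 0.02, 0.03):
    p = discovery_probability(x_true, sigma, lam, rng)
    print(f"x_true = {x_true:.2f}: P_disc = {p:.3f}")
```

By construction the discovery probability at `x_true = 0` comes out close to `alpha`, and it rises towards one as the true value moves away from zero.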

Now one has to scan over the true values of $\theta_{13}$ (and in the case of T2K also over $\delta_{\rm CP}$),
repeating the procedure outlined above at each point. Fig. 2 shows the probability $P_{\rm disc}$ to
exclude the hypothesis $\theta_{13} = 0$ at the 99.73% CL for T2K and D-Chooz as a function of
the true value of $\sin^2 2\theta_{13}$. For each true value $3 \times 10^6$ data sets have been simulated. For
T2K the two values chosen for $\delta_{\rm CP}$ correspond roughly to the best and worst sensitivity.
The vertical lines in the plot show the standard sensitivities calculated from the condition
$\Delta\chi^2 > 9$ without statistical fluctuations. One observes that for D-Chooz the standard
sensitivity corresponds indeed with good accuracy to $P_{\rm disc} = 50\%$, as expected for an "average"
experiment. For T2K the discovery probabilities corresponding to the standard sensitivities
are actually slightly higher, around 60%.

The dashed curves shown in Fig. 2 are obtained assuming a Gaussian measurement of
$\sin^2 2\theta_{13}$. In this case $P_{\rm disc}$ can be obtained in terms of the error function in the following
way. Assuming that $x$ is a Gaussian variable with standard deviation $\sigma$, the hypothesis $x = 0$
can be excluded at the 99.73% CL if the observed value $x^{\rm obs}$ is bigger than $3\sigma$. On the other
hand, the probability for $x^{\rm obs} > 3\sigma$ as a function of the true value $x^{\rm true}$ is easily calculated
as

$$ P\left[ x^{\rm obs} > 3\sigma \,|\, x^{\rm true} \right] = \int_{3\sigma}^{\infty} dx \, G(x; x^{\rm true}, \sigma) = \frac{1}{2} \left[ 1 - \mathrm{erf}\left( \frac{3\sigma - x^{\rm true}}{\sqrt{2}\,\sigma} \right) \right] , \qquad (4) $$

where $G(x; x^{\rm true}, \sigma)$ denotes the normal distribution with mean $x^{\rm true}$ and standard deviation
$\sigma$.

[Plot: Probability to observe a non-zero $\theta_{13}$ at 99.73% CL in T2K; horizontal axis: true $\sin^2 2\theta_{13}$]

Figure 3: Contours of the probability $P_{\rm disc}$ to establish a non-zero value of $\theta_{13}$ at the 99.73% CL for T2K
in the $\sin^2 2\theta_{13}^{\rm true}$-$\delta_{\rm CP}^{\rm true}$ plane. The dashed curve corresponds to the "standard sensitivity limit".



The dashed curves in Fig. 2 have been obtained from Eq. (4) by identifying $\sin^2 2\theta_{13} = x$
and by using for $\sigma$ one third of the 99.73% CL sensitivity limit from the standard procedure.
One observes that for D-Chooz this approximation is excellent. Hence, in this case $\sin^2 2\theta_{13}$
can indeed be considered as a Gaussian variable and the probability $P_{\rm disc}$ can be calculated
from the standard sensitivity limit and Eq. (4) without the need of a MC simulation. In
contrast, for T2K some deviations from Gaussianity are visible (especially for $\delta_{\rm CP} = 288^\circ$).
This is not unexpected, since in this case event numbers are small, background fluctuations
are important, and the dependence of the observables on the parameters is much more
complicated than in the case of D-Chooz.
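
The Gaussian estimate of Eq. (4) is simple enough to evaluate directly. The sketch below uses only the Python standard library; the function name `p_disc_gaussian` and the value 0.03 used as a "standard limit" are illustrative assumptions, not numbers from the analysis.

```python
import math

def p_disc_gaussian(x_true, sigma, n_sigma=3.0):
    """Probability that a Gaussian measurement exceeds n_sigma * sigma, Eq. (4)."""
    arg = (n_sigma * sigma - x_true) / (math.sqrt(2.0) * sigma)
    return 0.5 * (1.0 - math.erf(arg))

standard_limit = 0.03            # hypothetical 99.73% CL standard sensitivity limit
sigma = standard_limit / 3.0     # prescription used for the dashed curves
for x_true in (0.0, 0.02, 0.03, 0.04, 0.06):
    print(f"x_true = {x_true:.2f}: P_disc = {p_disc_gaussian(x_true, sigma):.4f}")
```

At `x_true` equal to the standard limit the argument of the error function vanishes and the discovery probability is exactly 50%, which is the sense in which the standard limit describes an "average" experiment in the Gaussian case.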

Contours of the probability $P_{\rm disc}$ for the T2K experiment in the plane of $\sin^2 2\theta_{13}^{\rm true}$ and
$\delta_{\rm CP}^{\rm true}$ are shown in Fig. 3. $P_{\rm disc}$ has been calculated for a grid of $41 \times 41$ values and at each
point in the grid $10^5$ data sets have been generated, leading in total to nearly $1.7 \times 10^8$
performed fits. This figure is the generalisation of the usual sensitivity limit (shown as
dashed curve), and for each true value of the parameters one can infer the probability that
T2K can establish a non-zero value of $\theta_{13}$ at the 99.73% CL. As indicated already in Fig. 2,
one finds that the standard sensitivity curve corresponds roughly to a discovery probability
of 60%. The region where $\theta_{13} > 0$ can be established with high probability, say greater
than 99%, is found for $\sin^2 2\theta_{13} > 0.0166$ to $0.041$, depending on the true value of $\delta_{\rm CP}$. It is
shifted with respect to the standard sensitivity limit to values of $\sin^2 2\theta_{13}$ larger by roughly
a factor of 2.

4 Sensitivity to CP violation 

In this section I will apply similar concepts to the sensitivity to CP violation (CPV), using
the T2HK experiment as an example. Now the relevant question is:

Given true values of $\theta_{13}$ and $\delta_{\rm CP}$, what is the probability that CPV can be
established at a certain confidence level?

In this case the hypothesis to be excluded, CP conservation (CPC), is more complicated than
in the case of the $\theta_{13}$ sensitivity. Whereas in the previous case the hypothesis was just a
single point in the parameter space ($\theta_{13} = 0$, independent of $\delta_{\rm CP}$), now one wants to exclude
$\delta_{\rm CP} = 0$ and $\delta_{\rm CP} = \pi$ for any value of $\theta_{13}$. Therefore, in the following the phrase "establish
CPV at the $100(1 - \alpha)\%$ CL" is considered to be equivalent to the statement that for any
value of $\sin^2 2\theta_{13}$ neither $\delta_{\rm CP} = 0$ nor $\delta_{\rm CP} = \pi$ is contained in the confidence regions in the
$\sin^2 2\theta_{13}$-$\delta_{\rm CP}$ plane at the $100(1 - \alpha)\%$ CL. For the construction of the confidence regions
the Feldman-Cousins method [7] will be used.

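This method is implemented as follows. To decide whether given data are consistent with
CPC at the $100(1 - \alpha)\%$ CL it is checked whether the statistic

$$ \Delta\chi^2_{\rm CPC}(\theta_{13}, \delta_{\rm CPC}) = \chi^2(\theta_{13}, \delta_{\rm CPC}) - \chi^2_{\min} \quad \text{with} \quad \delta_{\rm CPC} = 0, \pi \qquad (5) $$

fulfils the condition

$$ \Delta\chi^2_{\rm CPC}(\theta_{13}, \delta_{\rm CPC}) > \lambda(\alpha; \theta_{13}, \delta_{\rm CPC}) \qquad (6) $$

for all values of $\theta_{13}$, where $\lambda(\alpha; \theta_{13}, \delta_{\rm CPC})$ is calculated by MC simulation. Many artificial
data sets are generated for the parameters $\delta_{\rm CP} = 0, \pi$ and $\theta_{13}$, to map out the distribution
$f_{\rm CPC}(\Delta\chi^2_{\rm CPC} \,|\, \theta_{13}, \delta_{\rm CPC})$. Then $\lambda(\alpha; \theta_{13}, \delta_{\rm CPC})$ is determined by$^2$

$$ \alpha = \int_{\lambda(\alpha; \theta_{13}, \delta_{\rm CPC})}^{\infty} dx \, f_{\rm CPC}(x \,|\, \theta_{13}, \delta_{\rm CPC}) . \qquad (7) $$

In Fig. 4 $\lambda(\alpha; \theta_{13}, \delta_{\rm CPC})$ is shown for $\alpha = 0.01$. In contrast to the values 6.635 (9.21) following
from $\chi^2$-distributions for 1 (2) degrees of freedom, the actual cut values vary between 4 and
8.1 in the considered range of $\sin^2 2\theta_{13}$.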
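
The decision rule of Eq. (6) amounts to a logical "and" over all CP-conserving points. The sketch below expresses it for one data set on a discretised $\theta_{13}$ grid; the function name `cpv_established`, the dictionary keys, the three-point grid, and all numerical values are hypothetical and serve only to illustrate the logic.

```python
def cpv_established(delta_chi2_cpc, cut_values):
    """Apply the test of Eq. (6) for one data set on a theta13 grid.

    Both arguments map the CP-conserving phase ("0" or "pi") to a list over
    the same theta13 grid: the statistic of Eq. (5) and the MC cut values.
    """
    return all(
        d > lam
        for delta in ("0", "pi")
        for d, lam in zip(delta_chi2_cpc[delta], cut_values[delta])
    )

# Hypothetical cut values on a three-point theta13 grid.
cuts = {"0": [7.9, 6.5, 4.8], "pi": [8.1, 6.2, 4.5]}
# Data set A: every CP-conserving point is disfavoured beyond its cut.
stat_a = {"0": [12.3, 9.8, 15.1], "pi": [10.4, 8.7, 11.0]}
# Data set B: one point with delta = pi stays inside the confidence region.
stat_b = {"0": [12.3, 9.8, 15.1], "pi": [10.4, 5.9, 11.0]}
print(cpv_established(stat_a, cuts), cpv_established(stat_b, cuts))
```

A single CP-conserving point that survives, as in data set B, is enough to prevent the claim that CPV has been established.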

$^2$ Note that in the case of the $\theta_{13}$ sensitivity the corresponding distribution $f_0$, and therefore also $\lambda(\alpha)$,
are independent of the true parameter values. In the case of the CPV sensitivity the explicit parameter
dependence makes it necessary to refer to the full Feldman-Cousins confidence region construction, whereas
for the $\theta_{13}$ sensitivity this procedure is equivalent to a simple likelihood ratio test.

Figure 4: The cut value $\lambda(\alpha; \theta_{13}, \delta_{\rm CPC})$, see Eqs. (6) and (7), for the 99% CL ($\alpha = 0.01$) as a function of
$\sin^2 2\theta_{13}$ for $\delta_{\rm CPC} = 0$ and $\pi$. The horizontal lines show the canonical cut values following from
$\chi^2$-distributions for 1 and 2 degrees of freedom.

To determine the probability $P_{\rm disc}$ that CPV can be established for given true values of
$\theta_{13}$ and $\delta_{\rm CP}$, many artificial data sets are generated under the assumption of these parameters.
For each data set the test (6) is performed and $P_{\rm disc}$ is given by the fraction of data sets for
which CPV can be established at the corresponding CL. The results of such an analysis for
the T2HK experiment are shown in Fig. 5. $P_{\rm disc}$ has been calculated for a grid of $41 \times 41$
true values and at each point in the grid $5 \times 10^4$ data sets have been generated. This number
is somewhat reduced with respect to the one used for the $\theta_{13}$ sensitivity to keep the analysis
feasible. Because of the smaller ensemble the CL is reduced to 99%. The plot is restricted
to the interval $0 < \delta_{\rm CP} < \pi$; similar behaviour is expected for $\pi < \delta_{\rm CP} < 2\pi$.
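
The link between ensemble size and the achievable CL can be illustrated with simple counting. The sketch below estimates a probability from a fraction of data sets together with its binomial standard error, and gives the mean number of data sets expected beyond a cut with tail probability $\alpha$. The function names and the outcome of 25 000 passing data sets are hypothetical, and applying the tail count to $5 \times 10^4$ data sets assumes that the cut values are derived from ensembles of comparable size, which the text does not state explicitly.

```python
import math

def fraction_with_error(n_pass, n_sets):
    """Estimated probability and its binomial standard error."""
    p = n_pass / n_sets
    return p, math.sqrt(p * (1.0 - p) / n_sets)

def expected_tail_count(n_sets, alpha):
    """Mean number of data sets beyond a cut chosen to have tail probability alpha."""
    return n_sets * alpha

n_sets = 50_000  # ensemble size per grid point quoted in the text
for cl, alpha in (("99%", 0.01), ("99.73%", 0.0027)):
    print(f"{cl} CL: about {expected_tail_count(n_sets, alpha):.0f} data sets in the tail")

p, err = fraction_with_error(n_pass=25_000, n_sets=n_sets)  # hypothetical outcome
print(f"P_disc = {p:.3f} +/- {err:.4f}")
```

With fewer data sets in the tail the empirical cut value fluctuates more strongly, which is one way to understand why a lower CL is the more robust choice for a smaller ensemble.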

Fig. 5 provides the complete information on the possibilities to discover CPV in T2HK.
For each true value of the parameters one can infer the probability that CPV can be established
at the 99% CL. The standard sensitivity limit (obtained from the condition $\Delta\chi^2 > 6.635$
without taking into account statistical fluctuations) is shown as dashed curve. It is
remarkable that the standard limit is rather close to the contour for $P_{\rm disc} = 50\%$, i.e., an
"average" experiment. The deviations from this value can be understood from Fig. 4. Since
$\lambda(\alpha; \theta_{13}, \delta_{\rm CPC})$ is bigger than the canonical value 6.635 for $\sin^2 2\theta_{13} < 10^{-2}$, the sensitivity
to CPV is actually slightly worse, leading to $P_{\rm disc} < 50\%$ for the standard limit, whereas for
$\sin^2 2\theta_{13} > 10^{-2}$ one has $\lambda(\alpha; \theta_{13}, \delta_{\rm CPC}) < 6.635$, which leads to better sensitivities to CPV
and hence $P_{\rm disc} > 50\%$ at the standard limit.

Despite the rather good approximation of the standard sensitivity limit for $P_{\rm disc} = 50\%$,
it is evident from Fig. 5 that one has to be aware of the correct interpretation of such limits.
The region where CPV can be established with high probability is significantly smaller than
the standard sensitivity region. For example, maximal CPV, $\delta_{\rm CP} = \pi/2$, can be established
with a probability of more than 99% at the 99% CL for $\sin^2 2\theta_{13} > 3.3 \times 10^{-3}$, whereas the
corresponding standard sensitivity limit is $\sin^2 2\theta_{13} = 8.5 \times 10^{-4}$, nearly a factor of 4 smaller.

[Plot: Probability to establish CPV at 99% CL in T2HK]

Figure 5: Contours of the probability $P_{\rm disc}$ to establish CPV at the 99% CL for T2HK in the
$\sin^2 2\theta_{13}^{\rm true}$-$\delta_{\rm CP}^{\rm true}$ plane. The dashed curve corresponds to the "standard sensitivity limit".



5 Summary 

In this work the sensitivity of future neutrino oscillation experiments has been calculated 
by using a statistical procedure within a frequentist framework based on MC simulations. I 
have determined the probability that a discovery will be made at a given CL as a function 
of the true parameter values by generating a large ensemble of artificial experiments. The 
interpretation of the widely used "standard" sensitivity limits, where statistical fluctuations 
are neglected, has been clarified. 

As examples I have considered the discovery of a non-zero value for $\theta_{13}$ at the T2K
and Double-Chooz experiments. It has been found that for Double-Chooz the Gaussian
approximation is very well justified. The usually calculated sensitivity corresponds to the
performance of an average experiment (the discovery will be made with a probability of
50%), and the actual discovery probability can be estimated by a simple formula in terms
of the error function. In the case of T2K some deviations from Gaussianity are found and
the standard sensitivity limits correspond to a discovery probability of about 60%. Similar
concepts have been applied to the discovery of CP violation in neutrino oscillations at the
T2HK experiment. A definition of "establishing CP violation" based on confidence regions
according to the Feldman-Cousins prescription has been used.

For all considered cases I have found that standard sensitivity limits provide a reasonable 
approximation for an average experiment, corresponding to a discovery probability of order 
50%. However, one has to be aware of the correct interpretation of such limits. In general 
the region where a discovery can be made with high probability is significantly smaller than 
the one corresponding to the standard sensitivity limits. 